Robust Classification with Interval Data

Authors

  • Laurent El Ghaoui
  • Gert R.G. Lanckriet
  • Georges Natsoulis
Abstract

We consider a binary, linear classification problem in which the data points are assumed to be unknown but bounded within given hyper-rectangles, i.e., the covariates are bounded within intervals explicitly given for each data point separately. We address the problem of designing a robust classifier in this setting by minimizing the worst-case value of a given loss function over all possible choices of the data in these multi-dimensional intervals. We examine in detail the application of this methodology to three specific loss functions, arising in support vector machines, in logistic regression, and in minimax probability machines. We show that in each case the resulting problem is amenable to efficient interior-point algorithms for convex optimization. The methods tend to produce sparse classifiers, i.e., they induce many zero coefficients in the resulting weight vectors, and we provide some theoretical grounds for this property. After presenting possible extensions of this framework to handle label errors and other uncertainty models, we discuss in some detail our implementation, which exploits the potential sparsity, or a more general property referred to as regularity, of the input matrices.
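As a rough illustration of the worst-case idea for one of the three losses (the hinge loss used in support vector machines), the sketch below evaluates the largest hinge loss a single data point can incur when each covariate may move within its interval. It is not the authors' implementation: the function names and the center/radius parameterization of the intervals are assumptions made for this example. For a linear classifier the worst case has a closed form, namely the nominal hinge loss with the margin reduced by the sum of radius_j * |w_j|, and the brute-force helper checks this by enumerating the corners of the hyper-rectangle. That extra term is a weighted l1 norm of the weights, which is at least consistent with the sparsity the abstract mentions.

```python
import itertools
from typing import Sequence


def worst_case_hinge(w: Sequence[float], b: float, center: Sequence[float],
                     radius: Sequence[float], label: int) -> float:
    """Closed-form worst-case hinge loss over the box |x_j - center_j| <= radius_j.

    Assumed (hypothetical) parameterization: each interval is given by its
    midpoint `center` and half-width `radius`; `label` is +1 or -1.
    """
    nominal_margin = label * (sum(wj * cj for wj, cj in zip(w, center)) + b)
    # An adversary can lower the margin by at most sum_j radius_j * |w_j|.
    margin_loss = sum(rj * abs(wj) for wj, rj in zip(w, radius))
    return max(0.0, 1.0 - nominal_margin + margin_loss)


def brute_force_worst_case(w: Sequence[float], b: float, center: Sequence[float],
                           radius: Sequence[float], label: int) -> float:
    """Same quantity by enumerating all box corners.

    The hinge loss is convex in x, so its maximum over a box is attained
    at a corner. Exponential in the dimension; for checking only.
    """
    worst = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(w)):
        x = [cj + s * rj for cj, s, rj in zip(center, signs, radius)]
        margin = label * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        worst = max(worst, max(0.0, 1.0 - margin))
    return worst
```

Summing `worst_case_hinge` over the training points gives an objective that is convex in `(w, b)`, so it could in principle be handed to a generic convex solver; the abstract's interior-point formulation is not reproduced here.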

Related resources

Proposing a Robust Model of Interval Data Envelopment Analysis to Performance Measurement under Double Uncertainty Situations

When measuring performance with data envelopment analysis, it is essential to consider uncertainty in the data and how to deal with it, because a small deviation in the data can lead to a significant change in the performance results. In the real world, however, the data are uncertain in many cases. Interval data envelopment analysis is one of the most widely used approaches to d...

A Bootstrap Interval Robust Data Envelopment Analysis for Estimate Efficiency and Ranking Hospitals

Data envelopment analysis (DEA) is a non-parametric method for evaluating the efficiency of each unit. Limited resources in the healthcare economy are the main reason for measuring the efficiency of hospitals. In this study, a bootstrap interval data envelopment analysis (BIRDEA) is proposed for measuring the efficiency of hospitals affiliated with the Hamedan University of Medical Sciences. The propos...

Interval network data envelopment analysis model for classification of investment companies in the presence of uncertain data

The main purpose of this paper is to propose an approach for the performance measurement, classification, and ranking of investment companies (ICs) that takes internal structure and uncertainty into account. To this end, interval network data envelopment analysis (INDEA) models are extended. This model is capable of modeling two-stage efficiency with intermediate measures i...

Intelligent and Robust Genetic Algorithm Based Classifier

The concepts of robust classification and of intelligently controlling the search process of a genetic algorithm (GA) are introduced and integrated with a conventional genetic classifier to develop a new version, called the Intelligent and Robust GA-classifier (IRGA-classifier). It can efficiently approximate the decision hyperplanes in the feature space. It is shown experime...

Multi-Group Classification Using Interval Linear Programming

Among the various statistical and data mining discriminant analysis methods proposed so far for group classification, linear programming discriminant analysis has recently attracted researchers' interest. This study evaluates multi-group discriminant linear programming (MDLP) for classification problems against well-known methods such as neural networks and support vector machines. MDLP is less compli...

Publication date: 2003